2 issues here #277

eric1hello · 2023-04-27T05:45:43Z

I found two issues:
---- shader.cc
I think pI1 should be pI2 in this condition:

    if ((pI1->oprnd_type == INT_OP) || (pI1->oprnd_type == UN_OP)) {  // these counters get added up in mcPat to compute scheduler power
      m_stats->m_num_INTdecoded_insn[m_sid]++;
---- cooperative_groups
I tried to use cooperative_groups in the CUDA code below, but it does not seem to work; there appears to be a problem with the PTX. Do you know the reason?

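    #include <cstdio>
    #include <cooperative_groups.h>
    namespace cg = cooperative_groups;

    __device__ int count = 0;  // assumed declaration; not shown in the original post

    __device__ int atomicAggInc(int *ptr) {
      auto g = cg::coalesced_threads();
      int prev = 0;

      // The group leader reserves one slot per active thread.
      if (g.thread_rank() == 0)
        prev = atomicAdd(ptr, g.size());

      // Broadcast the leader's base value and add each thread's rank.
      prev = g.thread_rank() + g.shfl(prev, 0);

      return prev;
    }

    __global__ void vectorAdd(float *A, const float *B, float *C, int numElements) {
      int i = blockDim.x * blockIdx.x + threadIdx.x;

      //if (i < numElements) {
      //  C[i] = A[i] + B[i];
      //}
      if (i % 10 == 0) {
        int rankIdx = atomicAggInc(&count);
        printf("blockIdx = %d, threadIdx = %d, rank = %d \n", blockIdx.x, threadIdx.x, rankIdx);
      }
    }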

Fix for bugs in lazy write handling

added MSHR_HIT

change address type into ull

do not truncate 32 MSB bits of the memory address

…tion into HEAD

bug fix was_writeback_sent()

Fix cache hash function and renaming

…tion into HEAD

shared mem bank conflicts

Fixed an outdated comment line

Update gpgpusim.config

Adding GitHub CI

rename ci tests

Adding a non-zero return on error

* migrate_cmake: add package dependency checking
* migrate_cmake: port setup_environment to CMake
* migrate_cmake: break dependency checking and env export gen to different .cmake files
* migrate_cmake: use CUDAToolkit_FOUND to test for CUDA compiler
* migrate_cmake: use CUDAToolkit_FOUND to test for CUDA compiler
* migrate_cmake: use CUDAToolkit_FOUND to test for CUDA compiler
* migrate_cmake: properly parse for cuda version number
* migrate_cmake: set highest CUDA supported to be 11.10.x
* migrate_cmake: specify top level CMake file
* migrate_cmake: add libcuda cmake file
* migrate_cmake: use global compiler options and definitions
* migrate_cmake: add cmake file to src
* migrate_cmake: add cmake files for cuda-sim folder
* migrate_cmake: add cmake files to gpgpu-sim folder
* migrate_cmake: add cmake files for intersim
* migrate_cmake: add short test using cmake
* migrate_cmake: bump CXX standard requirement to 17
* Add cmake files for accelwattch
* migrate_cmake: remove use of GLOB to grab source files
* migrate_cmake: comment out the write protection on generated instructions.h
* migrate_cmake: create sym folder and add newline to generated setup file
* migrate_cmake: fix some path issues
* migrate_cmake: let cmake thinks flex and bison generate CXX files
* migrate_cmake: fix not linking pthread properly
* migrate_cmake: remove debug message
* migrate_cmake: add empty libopencl cmake file
* migrate_cmake: install phase and runtime version detect
* Added install phase to install the shared object and add symlinks
* Changes with CUDA toolkit will be detected and triggered a rebuild
* GPGPU-Sim detailed version string will be updated on each build
* Typo fix and fix correct bin dir
* Replace gcc -> g++ in intersim
* ignore setup
* check CMAKE_BUILD_TYPE
* set DCMAKE_BUILD_TYPE
---------
Co-authored-by: JRPAN <[email protected]>

CMAKE_BUILD_TYPE should be inside ${}

LDGSTS/LDGDEPBAR was introduced #62, but it's increment part was deleted by mistake. So add it.

In some applications, ldgsts may not exist between ldgdepbar. In such cases, add exception handling logic to insert an empty vector.

Reported-by: Okkyun Woo <[email protected]>
Signed-off-by: Wonhyuk Yang <[email protected]>

* remove implicit casting, cleanup unused bank_warp_shift parameter
* update cu init function prototype
* remove m_bank_warp_shift from function call

* add automated clang formatter
* Automated clang-format
* use /bin/bash and add print
* use default checkout ref
* Format only after tests are success
* Run CI on merge group
---------
Co-authored-by: barnes88 <[email protected]>
Co-authored-by: JRPAN <[email protected]>

* run formatter only on PR
* remove unused & unintilized variables
* fix signed & unsigned comparison warning
* enable merge queue
* resolve conflict
* in formatter, checkout the forked repo, not the base repo in PR
* Try to use jenkins for formatter
* Automated Format
---------
Co-authored-by: purdue-jenkins <[email protected]>

* Temp commit for Justin and Cassie to sync on code changes for adding per-stream status.
* Resolved compile errors.
* Removed redundant parameter
* Passed cuda_stream_id from accelsim to gpgpusim
* Cleaned up unused changes
* Changed vector to map, having operator problems.
* StreamID defaults to zero
* Implemented streams to inc_stats and so on
* Fixed TOTAL_ACCESS counts
* Implemented GLOBAL_TIMER.
* Fixed m_shader->get_kernel SEGFAULT issue in shader.cc.
* Use warp_init to track streamID instead of issue_warp
* Removed temp debug print
* Modified cache_stats to only print data from latest finished stream
  Added optional arg to cache_stats::print_stats, cache_stats::print_fail_stats and their upstream functions. When streamID is specified, print stats from that stream. When not specified, print all stats.
  NOTE: current implementation depending on streamid never equals -1
* Removed default arg values of streamID
* modified constructor of mem_fetch to pass in streamID
* changed get_streamid to get_streamID
* Added TODO to gpgpusim_entrypoint.cc and power_stat.cc
* Only collect power stats when enabled
* print last finished stream in PTX mode using last_streamID
* take out additional printf
* Add a field to baseline cache to indicate cache level
* save gpu object in cache
* Print stream ID only once per kernel
* rm test print
* use -1 for default stream id
* cleanup debug prints
* remove GLOABL_TIMER
* Automated clang-format
* Should be correct to print everything in power model
* addressing concerns & errors
* Automated clang-format
* add m_stats_pw in operator+
* Automated Format
---------
Co-authored-by: Justin Qiao <[email protected]>
Co-authored-by: Justin Qiao <[email protected]>
Co-authored-by: Tim Rogers <[email protected]>
Co-authored-by: JRPan <[email protected]>
Co-authored-by: purdue-jenkins <[email protected]>

…#78)

* we have gcc-11 now. Check version for more than 2 digits.
* version detection as well - And support c++ 11 by default

gvoskuilen and others added 30 commits October 26, 2020 15:19

Fix for bugs in lazy write handling

b7b9dc0

Merge pull request #3 from gvoskuilen/dev

2d73b61

Fix for bugs in lazy write handling

change address type into ull

950464e

do not truncate 32 MSB bits of the memory address

07f77e1

added MSHR_HIT

132c2ce

Merge pull request #6 from JRPan/add_mshr

85e36b9

added MSHR_HIT

Merge pull request #4 from allencho1222/patch-1

29cce50

change address type into ull

Merge pull request #5 from allencho1222/patch-2

e6b0608

do not truncate 32 MSB bits of the memory address

Merge branch 'dev' of https://github.com/accel-sim/gpgpu-sim_distribu…

5ac0b60

…tion into HEAD

bug fix was_writeback_sent

f3a0077

Merge pull request #7 from JRPan/fix-was_writeback_sent

67f89ab

bug fix was_writeback_sent()

fix hash funciton

51d9925

Merge pull request #9 from JRPan/fix-cache-hash

2f96645

Fix cache hash function and renaming

adding new RTX 3070 config

b430b36

Merge branch 'dev' of https://github.com/accel-sim/gpgpu-sim_distribu…

deb5eb5

…tion into HEAD

change the L1 cache policy to be on-miss based on recent ubench

09f10eb

change the L1 cache policy based on recent ubench

1ee03f0

parition CU allocation, add prints

5533464

minor fixes

645a0ea

useful print statement

46423a2

validated collector unit partitioning based on scheduler

b672880

sub core model dispatches only to assigned exec pipelines

fa76ab4

minor fix accessing du

c905726

fix find_ready reg_id

a72b84e

dont need du id

6ad5bad

remove prints

9219236

need at least 1 cu per sched for sub_core model, fix find_ready() reg_id

52a890c

move reg_id calc to cu object init

2db9120

fix assert

4825a1d

clean up redundant method args

e2b410d

barnes88 and others added 30 commits August 15, 2023 16:14

Merge pull request #58 from barnes88/fix-stats

53e99da

shared mem bank conflicts

LDGSTS, LDGDEPBAR and DEPBAR Implementations (#62)

a0c12f5

Update gpgpusim.config

d09254e

Fixed an outdated comment line

Merge branch 'dev' into dev

2ef277b

Adding Github Actino CI

3c95cd1

Merge pull request #63 from accel-sim/FJShen-patch-1

9aeacdf

Update gpgpusim.config

Merge branch 'dev' into dev

a06ebf7

update CI scripts

b2f0ebe

uses actions/checkout@v4

2bbfb8b

Merge branch 'dev' into dev-github-ci

291fb11

fix dubious ownership

77aefac

remove fermi and add newer gen cards

1bdb39a

Merge pull request #64 from accel-sim/dev-github-ci

67fc78c

Adding GitHub CI

rename ci tests

d935bd1

Merge pull request #65 from accel-sim/dev-ci

b1ff53d

rename ci tests

Merge branch 'dev' into dev

0b93c15

Merge pull request #52 from tgrogers/dev

6389301

Adding a non-zero return on error

CMAKE_BUILD_TYPE should be inside ${}

b70d930

Fix Build Type

570d75c

Merge pull request #68 from JRPan/dev

7dc9977

CMAKE_BUILD_TYPE should be inside ${}

Added guard to check if L2 is writeback or not (#73)

6aa7ed1

Reg bank patch (#41)

55419d7

* remove implicit casting, cleanup unused bank_warp_shift parameter
* update cu init function prototype
* remove m_bank_warp_shift from function call

Add support for SHF ptx instruction (#70)

081da0a

Change to calculate L2 BW if core freq and icnt freq are not the same (…

980eb88

…#78)

we have gcc-11 now. Check version for more than 2 digits. (#79)

667834c

* we have gcc-11 now. Check version for more than 2 digits.
* version detection as well - And support c++ 11 by default
